Boosting and Pocket Measurements

Author

Kris Sankaran

Published

February 18, 2026

Motivation

library(tidyverse)
library(DALEX)
library(mlr3)
library(mlr3learners)
library(mlr3tuning)
library(mlr3pipelines)
library(xgboost)
set.seed(20260216)
theme_set(theme_classic())

This example is based on the article “Pockets” by Jan Diehm and Amber Thomas for The Pudding. Their question: are pocket sizes systematically smaller in women’s pants, to the point that they can’t hold common objects like a wallet or smartphone? They gathered pocket measurements from major brands.

pockets <- read_csv("https://github.com/the-pudding/data/raw/refs/heads/master/pockets/measurements.csv")
pockets
# A tibble: 80 × 16
   brand        style  menWomen name  fabric price maxHeightFront minHeightFront
   <chr>        <chr>  <chr>    <chr> <chr>  <dbl>          <dbl>          <dbl>
 1 Arizona      skinny women    Fave… 78% c…  42             14.5           15  
 2 Arizona      strai… women    Perf… 78% c…  42             14.5           14  
 3 Ralph Lauren skinny women    Mode… 92% c…  89.5           13             13.5
 4 Ralph Lauren strai… women    Prem… 92% c…  89.5           13             13.5
 5 Uniqlo       skinny women    Skin… 87% c…  39.9           13             13  
 6 Uniqlo       strai… women    High… 98% c…  39.9           15.5           12  
 7 Calvin Klein skinny women    Midr… 98% c…  79.5           12             12  
 8 Calvin Klein strai… women    Stra… 85% c…  69.5           14             11.2
 9 Lucky        skinny women    Ava … 69% c…  99             13             14.5
10 Lucky        strai… women    Swee… 96% c…  79.5           15             16.5
# ℹ 70 more rows
# ℹ 8 more variables: rivetHeightFront <dbl>, maxWidthFront <dbl>,
#   minWidthFront <dbl>, maxHeightBack <dbl>, minHeightBack <dbl>,
#   maxWidthBack <dbl>, minWidthBack <dbl>, cutout <lgl>

Our approach is to predict whether pockets come from men’s or women’s pants using all the available features. High accuracy would mean that pockets differ systematically in style, price, or dimensions. We can then inspect variable importance to see which features drive the distinction.

Data Processing

We convert character-valued columns into factors for easier encoding later on. We omit fabric and name (ID-like variables with many levels) and maxHeightFront (so strongly associated with the response that it would dominate interpretation and make this example less representative).

pockets <- pockets |>
    select(!any_of(c("fabric", "name", "maxHeightFront"))) |>
    mutate(across(where(is.character), as.factor))

Here’s the processed version of the data.

pockets
# A tibble: 80 × 13
   brand      style menWomen price minHeightFront rivetHeightFront maxWidthFront
   <fct>      <fct> <fct>    <dbl>          <dbl>            <dbl>         <dbl>
 1 Arizona    skin… women     42             15                6.5          16.5
 2 Arizona    stra… women     42             14                6.5          16  
 3 Ralph Lau… skin… women     89.5           13.5              6.5          14.5
 4 Ralph Lau… stra… women     89.5           13.5              6.5          14.5
 5 Uniqlo     skin… women     39.9           13                5.5          14  
 6 Uniqlo     stra… women     39.9           12                6.5          16.5
 7 Calvin Kl… skin… women     79.5           12                6.5          14.5
 8 Calvin Kl… stra… women     69.5           11.2              6.5          15.5
 9 Lucky      skin… women     99             14.5              6            13.5
10 Lucky      stra… women     79.5           16.5              5.5          15.5
# ℹ 70 more rows
# ℹ 6 more variables: minWidthFront <dbl>, maxHeightBack <dbl>,
#   minHeightBack <dbl>, maxWidthBack <dbl>, minWidthBack <dbl>, cutout <lgl>

Hyperparameter Tuning

Next, we define our hyperparameter tuning pipeline. The setup resembles the 03-gotv_cart.qmd analysis, but with two changes. First, we use classif.xgboost to specify that we want a boosting model. Second, we add a po("encode") preprocessing step to turn categorical features into numerical encodings. Although boosting can handle categorical types in theory, the xgboost implementation in mlr3 requires numerical predictors. We could do this manually, but the mlr3 pipeline manages it for us.

task <- as_task_classif(
    pockets,
    target = "menWomen"
)

learner <- as_learner(
    po("encode") %>>%
    lrn(
        "classif.xgboost",
        nrounds = to_tune(100, 300),
        max_depth = to_tune(1, 3),
        eta = to_tune(1e-3, 1e-1),
        predict_type = "prob"
    )
)
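As a side check (not part of the original pipeline), we can apply po("encode") to the task on its own to see what xgboost will actually receive. This sketch assumes the task object defined above and uses the PipeOp’s train method directly, which returns a list containing the transformed task.

```r
# Hypothetical illustration: run the encoding step by itself and peek at
# the result. Factor columns like brand and style become one-hot
# indicator columns.
encoded <- po("encode")$train(list(task))[[1]]
head(encoded$data())
```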

instance <- ti(
  task = task,
  learner = learner,
  resampling = rsmp("cv", folds = 2),
  measures = msr("classif.ce"),
  terminator = trm("none")
)

Now that we’ve set up the pipeline, let’s search over hyperparameters to find a reasonable choice.

tuner <- tnr("grid_search", resolution = 3, batch_size = 3)
tuner$optimize(instance)
   classif.xgboost.eta classif.xgboost.max_depth classif.xgboost.nrounds
                 <num>                     <int>                   <int>
1:                 0.1                         2                     100
   learner_param_vals  x_domain classif.ce
               <list>    <list>      <num>
1:          <list[7]> <list[3]>       0.15

The results show that a relatively aggressive learning rate (eta = 0.1) with shallow trees that allow some interactions (max_depth = 2) did the best. Note that the measure here is classification error, not cross-entropy: 0 means perfect classification, and about 0.5 corresponds to random guessing on this balanced two-class problem. At 0.15, our model does much better than that baseline.

as.data.table(instance$archive) |>
  rename_with(~ str_remove(., "classif.xgboost.")) |>
  arrange(classif.ce) |>
  select(1:4)
       eta max_depth nrounds classif.ce
     <num>     <int>   <int>      <num>
 1: 0.1000         2     100     0.1500
 2: 0.0505         1     200     0.1625
 3: 0.1000         3     200     0.1625
 4: 0.1000         2     200     0.1625
 5: 0.0505         3     300     0.1625
 6: 0.0505         2     200     0.1625
 7: 0.0505         1     300     0.1625
 8: 0.0505         3     200     0.1625
 9: 0.1000         3     100     0.1625
10: 0.0505         2     300     0.1750
11: 0.1000         1     100     0.1750
12: 0.1000         1     200     0.1750
13: 0.1000         2     300     0.1750
14: 0.0505         2     100     0.1750
15: 0.1000         3     300     0.1750
16: 0.0505         1     100     0.1750
17: 0.0010         1     300     0.1875
18: 0.1000         1     300     0.1875
19: 0.0505         3     100     0.1875
20: 0.0010         3     200     0.2750
21: 0.0010         2     200     0.2750
22: 0.0010         2     300     0.2750
23: 0.0010         3     300     0.2750
24: 0.0010         1     100     0.5750
25: 0.0010         3     100     0.5750
26: 0.0010         1     200     0.5750
27: 0.0010         2     100     0.5750
       eta max_depth nrounds classif.ce
     <num>     <int>   <int>      <num>

Refit Model

Let’s refit the model on the entire dataset (so far, we’ve been training on CV split versions). This is the model that we would use in practice, and it’s what we’ll interpret next.

learner$param_set$values <- instance$result_learner_param_vals
learner$train(task) # retrain on all data
fit <- learner$model

Interpretation

A simple check: compare predicted probabilities against the actual class. The probabilities barely overlap, meaning that the measured features predict men’s vs. women’s pocket type quite reliably.
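The comparison described above can be sketched as follows. This is not code from the original post; it assumes the trained learner from the previous section and that the probability matrix has a column named "men" (the first factor level alphabetically).

```r
# Sketch: plot predicted P(men) separately for true men's and women's
# pockets. Well-separated histograms indicate reliable classification.
preds <- learner$predict(task)
tibble(truth = preds$truth, prob_men = preds$prob[, "men"]) |>
  ggplot(aes(prob_men, fill = truth)) +
  geom_histogram(position = "identity", alpha = 0.6, bins = 30)
```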

The xgboost package provides its own variable importance measure, based on how often each variable was used in splits during training and how much those splits increased node purity. We can see that the top four variables all relate to the dimensions of the pocket.

fit <- learner$model$classif.xgboost$model
fit <- attr(fit[[1]], "model") # messy, but this seems to be the only way to recover the original xgboost model
xgb.importance(model = fit)
             Feature        Gain       Cover   Frequency
              <char>       <num>       <num>       <num>
 1:   minHeightFront 0.621449765 0.360998572 0.270042194
 2:    minWidthFront 0.123103109 0.175654586 0.185654008
 3:    maxHeightBack 0.079520156 0.136094368 0.147679325
 4: rivetHeightFront 0.060789738 0.077658123 0.075949367
 5:           cutout 0.042073073 0.056660149 0.054852321
 6:            price 0.031754325 0.085450628 0.126582278
 7:    minHeightBack 0.025086866 0.048867292 0.050632911
 8:     maxWidthBack 0.006768666 0.032342490 0.046413502
 9:     minWidthBack 0.003424694 0.017630925 0.033755274
10:    maxWidthFront 0.003153266 0.003200917 0.004219409
11:       style.slim 0.002876342 0.005441949 0.004219409

Finally, we make partial dependence plots. Each line represents one pocket from the data. By passing a custom predict_function argument, we can study changes in predicted probabilities rather than just class labels. Each curve shows how a pocket’s predicted probability shifts when we intervene on the plotted variable. The effect of minHeightFront is especially clear: once the minimum pocket height exceeds about 15, the model predicts a men’s pocket.

explainer <- DALEX::explain(
  learner,
  data = pockets,
  y = as.numeric(pockets$menWomen),
  predict_function = \(model, newdata) {
    model$predict_newdata(newdata)$prob[, 1]
  }
)
Preparation of a new explainer is initiated
  -> model label       :  R6  (  default  )
  -> data              :  80  rows  13  cols 
  -> data              :  tibble converted into a data.frame 
  -> target variable   :  80  values 
  -> predict function  :  function(model, newdata) {     model$predict_newdata(newdata)$prob[, 1] } 
  -> predicted values  :  No value for predict function target column. (  default  )
  -> model_info        :  package Model of class: GraphLearner package unrecognized , ver. Unknown , task regression (  default  ) 
  -> predicted values  :  numerical, min =  0.006793982 , mean =  0.5003302 , max =  0.9943342  
  -> residual function :  difference between y and yhat (  default  )
  -> residuals         :  numerical, min =  0.005665779 , mean =  0.9996698 , max =  1.993206  
  A new explainer has been created!  
pdp <- model_profile(explainer, variables = c("minHeightFront", "minWidthFront", "rivetHeightFront"))
plot(pdp, geom = "profiles")

Indeed, if we go back to the raw data, we can see a clear difference in these variables between men’s and women’s pockets.
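One way to see this (a sketch, not from the original post) is to plot the distributions of the most important measurements side by side for the two groups:

```r
# Boxplots of the top importance variables, split by men's vs. women's.
# Uses the processed pockets data frame from above.
pockets |>
  pivot_longer(c(minHeightFront, minWidthFront, maxHeightBack)) |>
  ggplot(aes(menWomen, value)) +
  geom_boxplot() +
  facet_wrap(~ name, scales = "free_y")
```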